Shaping in Reinforcement Learning by Changing the Physics of the Problem
نویسنده
چکیده
Children learn to ride a bicycle by using training wheels. They are actually trying to learn one task (riding without training wheels) by training another one. In general, solving a difficult problem can be facilitated by training other problems. This is the basic idea of shaping. It is essential to ensure that spending time on the modified task will help solving the original one. In this paper we prove that given a finite MDP with a limited reward signal and γ < 1, we are guaranteed that if a series of tasks converge to the original one then the optimal value function converges to the original one as well.
منابع مشابه
Multicast Routing in Wireless Sensor Networks: A Distributed Reinforcement Learning Approach
Wireless Sensor Networks (WSNs) are consist of independent distributed sensors with storing, processing, sensing and communication capabilities to monitor physical or environmental conditions. There are number of challenges in WSNs because of limitation of battery power, communications, computation and storage space. In the recent years, computational intelligence approaches such as evolutionar...
متن کاملA Multiagent Reinforcement Learning algorithm to solve the Community Detection Problem
Community detection is a challenging optimization problem that consists of searching for communities that belong to a network under the assumption that the nodes of the same community share properties that enable the detection of new characteristics or functional relationships in the network. Although there are many algorithms developed for community detection, most of them are unsuitable when ...
متن کاملUsing Reinforcement Learning to Make Smart Energy Storage Source in Microgrid
The use of renewable energy in power generation and sudden changes in load and fault in power transmission lines may cause a voltage drop in the system and challenge the reliability of the system. One way to compensate the changing nature of renewable energies in the short term without the need to disconnect loads or turn on other plants, is the use of renewable energy storage. The use of ener...
متن کاملOperation Scheduling of MGs Based on Deep Reinforcement Learning Algorithm
: In this paper, the operation scheduling of Microgrids (MGs), including Distributed Energy Resources (DERs) and Energy Storage Systems (ESSs), is proposed using a Deep Reinforcement Learning (DRL) based approach. Due to the dynamic characteristic of the problem, it firstly is formulated as a Markov Decision Process (MDP). Next, Deep Deterministic Policy Gradient (DDPG) algorithm is presented t...
متن کاملDynamic Obstacle Avoidance by Distributed Algorithm based on Reinforcement Learning (RESEARCH NOTE)
In this paper we focus on the application of reinforcement learning to obstacle avoidance in dynamic Environments in wireless sensor networks. A distributed algorithm based on reinforcement learning is developed for sensor networks to guide mobile robot through the dynamic obstacles. The sensor network models the danger of the area under coverage as obstacles, and has the property of adoption o...
متن کامل